3 research outputs found

    Use of Coherent Point Drift in computer vision applications

    This thesis presents the novel use of Coherent Point Drift (CPD) in improving the robustness of a number of computer vision applications. The CPD approach includes two methods for registering two images, rigid and non-rigid point set registration, which differ in the transformation model used. The key characteristic of a rigid transformation is that the distance between points is preserved, which means it can be used in the presence of translation, rotation, and scaling. Non-rigid transformations (or affine transforms) provide the opportunity of registering under non-uniform scaling and skew. The idea is to move one point set coherently to align with the second point set. The CPD method finds both the non-rigid transformation and the correspondence between the two point sets at the same time, without requiring an a priori declaration of the transformation model used. The first part of this thesis is focused on speaker identification in video conferencing. A real-time, audio-coupled, video-based approach is presented, which focuses more on the video analysis side than on the audio analysis, which is known to be prone to errors. CPD is effectively utilised for lip movement detection, and a temporal face detection approach is used to minimise false positives if the face detection algorithm fails to perform. The second part of the thesis is focused on multi-exposure and multi-focus image fusion with compensation for camera shake. Scale Invariant Feature Transforms (SIFT) are first used to detect keypoints in the images being fused. Subsequently, this point set is reduced to remove outliers using RANSAC (RANdom SAmple Consensus), and finally the point sets are registered using CPD with non-rigid transformations. The registered images are then fused with a Contourlet-based image fusion algorithm that makes use of a novel alpha blending and filtering technique to minimise artefacts.
The thesis evaluates the performance of the algorithm in comparison to a number of state-of-the-art approaches, including the key commercial products currently available on the market, showing significantly improved subjective quality in the fused images. The final part of the thesis presents a novel approach to Vehicle Make & Model Recognition (VMMR) in CCTV video footage. CPD is used to remove the skew of detected vehicles, since CCTV cameras are not specifically configured for the VMMR task and may capture vehicles at different approach angles. A LESH (Local Energy Shape Histogram) feature-based approach is used for vehicle make and model recognition, with the novelty that temporal processing is used to improve reliability. A number of further algorithms are used to maximise the reliability of the final outcome. Experimental results show that the proposed system achieves an accuracy in excess of 95% when tested on real CCTV footage with no prior camera calibration.
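
The rigid CPD variant mentioned in the abstract above can be sketched roughly as follows. This is a minimal NumPy illustration of the standard rigid CPD expectation-maximisation updates, not the thesis's implementation; the function name `cpd_rigid`, the outlier weight `w`, the stopping rule, and the demo point sets are all assumptions made for illustration. Note that in CPD's rigid formulation the transform includes a uniform scale factor, so point distances are preserved only up to that scale.

```python
import numpy as np

def cpd_rigid(X, Y, w=0.0, max_iter=200, tol=1e-10):
    """Align moving points Y (M x D) to fixed points X (N x D).

    Returns (s, R, t) such that s * Y @ R.T + t approximates X.
    w in [0, 1) is an assumed weight for a uniform outlier component.
    """
    N, D = X.shape
    M = Y.shape[0]
    s, R, t = 1.0, np.eye(D), np.zeros(D)
    # Initial isotropic variance: mean squared distance over all point pairs.
    sigma2 = ((X[None, :, :] - Y[:, None, :]) ** 2).sum() / (D * M * N)
    for _ in range(max_iter):
        # E-step: soft correspondences P[m, n] between Y[m] and X[n].
        TY = s * Y @ R.T + t
        d2 = ((X[None, :, :] - TY[:, None, :]) ** 2).sum(axis=2)
        num = np.exp(-d2 / (2.0 * sigma2))
        c = (2.0 * np.pi * sigma2) ** (D / 2.0) * w / (1.0 - w) * M / N
        P = num / np.maximum(num.sum(axis=0, keepdims=True) + c, 1e-300)
        # M-step: closed-form update of rotation, scale and translation.
        Np = P.sum()
        P1, Pt1 = P.sum(axis=1), P.sum(axis=0)
        mu_x, mu_y = X.T @ Pt1 / Np, Y.T @ P1 / Np
        Xh, Yh = X - mu_x, Y - mu_y
        A = Xh.T @ P.T @ Yh
        U, _, Vt = np.linalg.svd(A)
        C = np.eye(D)
        C[-1, -1] = np.linalg.det(U @ Vt)  # guard against reflections
        R = U @ C @ Vt
        tr_AR = np.sum(A * R)  # equals trace(A.T @ R)
        s = tr_AR / (P1 @ (Yh ** 2).sum(axis=1))
        t = mu_x - s * R @ mu_y
        sigma2_new = (Pt1 @ (Xh ** 2).sum(axis=1) - s * tr_AR) / (Np * D)
        sigma2 = max(sigma2_new, tol)
        if sigma2_new <= tol:  # variance collapsed: the sets coincide
            break
    return s, R, t

# Hypothetical demo: recover a known rotation, scale and shift.
Y_demo = np.array([[0, 0], [1, 0], [2, 0], [3, 0], [0, 1],
                   [0, 2], [1, 1], [2.5, 1.5], [3, 2], [1.5, 2.5]], float)
theta = 0.2
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta), np.cos(theta)]])
X_demo = 1.05 * Y_demo @ R_true.T + np.array([0.3, -0.2])
s_est, R_est, t_est = cpd_rigid(X_demo, Y_demo)
```

The transform is applied to a point set as `s_est * Y_demo @ R_est.T + t_est`. For noisy data the variance never collapses, so a practical version would also stop when the change between iterations becomes small.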

    Contourlet based multi-exposure image fusion with compensation for multi-dimensional camera shake

    Multi-exposure image fusion algorithms are used for enhancing the perceptual quality of an image captured by sensors of limited dynamic range by rendering multiple images captured at different exposure settings. One practical problem overlooked by existing algorithms is the compensation required for image deregistration due to possible multi-dimensional camera shake that occurs within the time gap between capturing the multiple exposure images. In our approach, the RANdom SAmple Consensus (RANSAC) algorithm is used to identify inliers among the key points detected by the Scale Invariant Feature Transform (SIFT) approach, followed by the use of the Coherent Point Drift (CPD) algorithm to register the images based on the selected set of key points. We provide experimental results on a set of images with multi-dimensional (translational and rotational) camera shake to demonstrate the proposed algorithm's capability to register and fuse multiple exposure images taken in the presence of camera shake, providing subjectively enhanced output images.
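
The RANSAC inlier selection step described above can be illustrated with a small sketch. The abstract does not state which motion model the authors fit, so this example assumes a simplified 2D rotation-plus-translation model (matching the "translational and rotational" shake mentioned) and represents key points as complex numbers; the function name `ransac_rigid_2d`, the pixel threshold, and the matched points are hypothetical.

```python
import cmath
import random

def ransac_rigid_2d(src, dst, n_iter=200, thresh=1.0, seed=0):
    """Find matches consistent with one rotation + translation.

    src, dst: lists of complex numbers (x + 1j*y), matched by index.
    Returns ((rot, shift), inlier_indices) with dst ~ rot * src + shift.
    """
    rng = random.Random(seed)
    best_model, best_inliers = (1 + 0j, 0j), []
    for _ in range(n_iter):
        # Minimal sample: two matches determine a 2D rigid motion.
        i, j = rng.sample(range(len(src)), 2)
        d_src, d_dst = src[j] - src[i], dst[j] - dst[i]
        if abs(d_src) < 1e-9 or abs(d_dst) < 1e-9:
            continue  # degenerate sample, skip it
        ratio = d_dst / d_src
        rot = ratio / abs(ratio)  # unit complex number = pure rotation
        shift = dst[i] - rot * src[i]
        # Consensus set: matches whose residual is under the threshold.
        inliers = [k for k in range(len(src))
                   if abs(rot * src[k] + shift - dst[k]) < thresh]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (rot, shift), inliers
    return best_model, best_inliers

# Hypothetical matches: 10 consistent with a small shake, 3 gross outliers.
rot_true, shift_true = cmath.exp(1j * 0.05), 3 - 2j
src_pts = [10 + 20j, 35 + 80j, 60 + 15j, 90 + 55j, 120 + 30j,
           150 + 95j, 25 + 50j, 75 + 110j, 140 + 10j, 100 + 100j,
           50 + 50j, 110 + 70j, 20 + 100j]
dst_pts = [rot_true * p + shift_true for p in src_pts[:10]]
dst_pts += [src_pts[10] + (40 - 35j), src_pts[11] + (-50 + 20j),
            src_pts[12] + (30 + 60j)]
(rot_est, shift_est), inlier_idx = ransac_rigid_2d(src_pts, dst_pts)
```

Only the matches listed in `inlier_idx` would then be passed on to the registration stage, so that mismatched key points do not distort the estimated alignment.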

    Subjectively optimised multi-exposure and multi-focus image fusion with compensation for camera shake

    Multi-exposure image fusion algorithms are used for enhancing the perceptual quality of an image captured by sensors of limited dynamic range. This is achieved by rendering a single scene based on multiple images captured at different exposure times. Similarly, multi-focus image fusion is used when the limited depth of focus at a selected focus setting of a camera results in parts of an image being out of focus. The solution adopted is to fuse together a number of multi-focus images to create an image that is in focus throughout. In this paper we propose a single algorithm that can perform both multi-focus and multi-exposure image fusion. This algorithm is a novel approach in which a set of unregistered multi-exposure/focus images is first registered before being fused. The registration of images is done by identifying matching key points in the constituent images using Scale Invariant Feature Transforms (SIFT). The RANdom SAmple Consensus (RANSAC) algorithm is used to identify inliers among the SIFT key points, removing outliers that can cause errors in the registration process. Finally, we use the Coherent Point Drift algorithm to register the images, preparing them to be fused in the subsequent fusion stage. For the fusion of images, a novel approach based on an improved version of a Wavelet Based Contourlet Transform (WBCT) is used. The experimental results show that the proposed algorithm is capable of producing HDR or multi-focus images by registering and fusing a set of multi-exposure or multi-focus images taken in the presence of camera shake.